Checkpointing with mutable checkpoints
Abstract
There are two approaches to reducing the overhead associated with coordinated checkpointing: the first is to minimize the number of synchronization messages and the number of checkpoints; the other is to make the checkpointing process non-blocking. In our previous work (IEEE Parallel Distributed Systems 9 (12) (1998) 1213), we proved that there does not exist a non-blocking algorithm which forces only a minimum number of processes to take their checkpoints. In this paper, we present a min-process algorithm which relaxes the non-blocking condition while trying to minimize the blocking time, and a non-blocking algorithm which relaxes the min-process condition while minimizing the number of checkpoints saved on the stable storage. The proposed non-blocking algorithm is based on the concept of “mutable checkpoint”, which is neither a tentative checkpoint nor a permanent checkpoint. Based on mutable checkpoints, our non-blocking algorithm avoids the avalanche effect and forces only a minimum number of processes to take their checkpoints on the stable storage. © 2002 Elsevier Science B.V. All rights reserved.
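As a rough illustration of the "mutable checkpoint" idea, the following Python sketch models one possible checkpoint lifecycle. It is not the paper's algorithm. It assumes that a mutable checkpoint is held locally, and is either promoted to a tentative checkpoint on stable storage when a checkpoint request arrives or discarded when the checkpointing round commits. The class `Process`, its fields, and the method names are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Optional


class Kind(Enum):
    MUTABLE = auto()    # assumed: held locally, not on stable storage
    TENTATIVE = auto()  # written to stable storage, awaiting commit
    PERMANENT = auto()  # committed on stable storage


@dataclass
class Process:
    """Hypothetical process model; names are illustrative only."""
    pid: int
    state: dict = field(default_factory=dict)
    local: Optional[dict] = None    # local slot for a mutable checkpoint
    stable: Optional[dict] = None   # simulated stable storage
    kind: Optional[Kind] = None
    stable_writes: int = 0          # counts writes to stable storage

    def take_mutable(self) -> None:
        # Cheap local snapshot; no stable-storage write happens here.
        self.local = dict(self.state)
        self.kind = Kind.MUTABLE

    def on_checkpoint_request(self) -> None:
        # Promote an existing mutable checkpoint if there is one,
        # otherwise snapshot the current state.
        if self.kind is Kind.MUTABLE:
            snapshot = self.local
        else:
            snapshot = dict(self.state)
        self.stable = snapshot
        self.local = None
        self.stable_writes += 1
        self.kind = Kind.TENTATIVE

    def on_commit(self) -> None:
        if self.kind is Kind.TENTATIVE:
            self.kind = Kind.PERMANENT
        elif self.kind is Kind.MUTABLE:
            # Never requested in this round: discard without touching
            # stable storage.
            self.local = None
            self.kind = None
```

Under these assumptions, a process that takes a mutable checkpoint but never receives a request pays no stable-storage cost, which is the kind of saving the abstract describes.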
Similar resources
Low-Cost Checkpointing with Mutable Checkpoints in Mobile Computing
Mobile computing raises many new issues, such as lack of stable storage, low bandwidth of wireless channel, high mobility, and limited battery life. These new issues make traditional checkpointing algorithms unsuitable. In this paper, we introduce the concept of "mutable checkpoint", which is neither a tentative checkpoint nor a permanent checkpoint. Mutable checkpoints can be saved anywhere;...
Real Time Snapshot Collection Algorithm for Mobile Distributed Systems with Minimum Number of Checkpoints
Checkpointing is an efficient way of implementing fault tolerance in distributed systems. Mobile computing raises many new issues, such as high mobility, lack of stable storage on mobile hosts (MHs), low bandwidth of wireless channels, limited battery life and disconnections that make the traditional checkpointing protocols unsuitable for such systems. Minimum process non-blocking coordinated c...
Handling Recurrent Failures in Coordinated Checkpointing for Mobile Distributed Systems
We propose a minimum-process coordinated checkpointing algorithm for non-deterministic mobile distributed systems, where no useless checkpoints are taken. An effort has been made to minimize the blocking of processes and synchronization message overhead. We capture the partial transitive dependencies during the normal execution by piggybacking dependency vectors onto computation messages. Frequ...
Soft-Checkpointing Based Hybrid Synchronous Checkpointing Protocol for Mobile Distributed Systems
Minimum-process coordinated checkpointing is a suitable approach to introduce fault tolerance in mobile distributed systems transparently. In order to balance the checkpointing overhead and the loss of computation on recovery, the authors propose a hybrid checkpointing algorithm, wherein an all-process coordinated checkpoint is taken after the execution of minimum-process coordinated checkpoint...
An Enhanced MSS-based Checkpointing Scheme for Mobile Computing Environment
Mobile computing systems are made up of different components among which Mobile Support Stations (MSSs) play a key role. This paper proposes an efficient MSS-based non-blocking coordinated checkpointing scheme for mobile computing environment. In the scheme suggested nearly all aspects of checkpointing and their related overheads are forwarded to the MSSs and as a result the workload of Mobile ...
Journal:
- Theor. Comput. Sci.
Volume 290, Issue
Pages -
Publication date: 2003